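Contrastive learning (CL) has recently been applied to adversarial learning tasks. This practice treats adversarial samples as additional positive views of an instance and, by maximizing their agreement with each other, yields better adversarial robustness. However, the mechanism may be flawed: adversarial perturbations can cause instance-level identity confusion, which can impede CL performance by pulling together different instances with separate identities. To address this issue, we propose to treat adversarial samples unequally when contrasting, using an asymmetric InfoNCE objective (A-InfoNCE) that allows adversarial samples to be considered in a discriminating way. Specifically, adversaries are viewed either as inferior positives that induce weaker learning signals, or as hard negatives that exhibit higher contrast to other negative samples. In this asymmetric fashion, the adverse impact of the conflicting objectives of CL and adversarial learning can be effectively mitigated. Experiments show that our approach consistently outperforms existing adversarial CL methods across different fine-tuning schemes, without additional computational cost. The proposed A-InfoNCE is also a generic form that can be readily extended to other CL methods. Code is available at https://github.com/yqy2001/a-infonce.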
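The "inferior positive" reading of the asymmetric objective can be sketched in a few lines. This is only an illustration under stated assumptions, not the formulation from the paper or its repository: the function names, the scalar `adv_weight`, the temperature, and the toy embeddings are all hypothetical. The adversarial view contributes an InfoNCE term like the clean view does, but with a smaller weight, so it produces a weaker learning signal.

```python
import math

def dot(u, v):
    """Inner product used as the similarity score between two embeddings."""
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Standard InfoNCE loss for one anchor, one positive, and a list of negatives."""
    pos = math.exp(dot(anchor, positive) / temperature)
    neg = sum(math.exp(dot(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

def asymmetric_loss(anchor, clean_view, adv_view, negatives,
                    adv_weight=0.5, temperature=0.5):
    """Hypothetical asymmetric variant: the adversarial positive is down-weighted.

    adv_weight = 1.0 treats both views equally; adv_weight < 1.0 (an assumed
    hyperparameter) weakens the pull toward the adversarial view.
    """
    clean_term = info_nce(anchor, clean_view, negatives, temperature)
    adv_term = info_nce(anchor, adv_view, negatives, temperature)
    return clean_term + adv_weight * adv_term

# Toy 2-D embeddings, chosen only for illustration.
anchor = [1.0, 0.0]
clean_view = [0.9, 0.1]
adv_view = [0.2, 0.8]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

print(asymmetric_loss(anchor, clean_view, adv_view, negatives, adv_weight=0.5))
```

The "hard negative" reading would instead move the adversarial sample into the denominator with extra emphasis; it is not shown here.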
translated by 谷歌翻译
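This paper substantially extends our work published at ECCV, in which an intermediate-level attack was proposed to improve the transferability of some baseline adversarial examples. Specifically, we advocate a framework in which a direct linear mapping is established from intermediate-level discrepancies (between adversarial features and benign features) to the prediction loss of the adversarial example. By delving into the core components of this framework, we show that 1) a variety of linear regression models can be considered for establishing the mapping, 2) the intermediate-level adversarial discrepancy finally obtained is correlated with the transferability of the adversarial examples, and 3) performance can be boosted further by running the baseline attack multiple times with random initialization. By leveraging these findings, we achieve a new state of the art on transfer-based $\ell_\infty$ and $\ell_2$ attacks. Our code is publicly available at https://github.com/qizhangli/ila-plus-plus-lr.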
Metaheuristics are widely used in various fields and have attracted much attention in the scientific and industrial communities. In recent years, the number of new metaheuristic names has grown continuously. Generally, the inventors attribute the novelty of these algorithms to inspiration from biology, human behavior, physics, or other phenomena. In addition, these new algorithms show competitive performance when compared against basic versions of other metaheuristics on classical benchmark problems without shift or rotation. In this study, we exhaustively tabulate more than 500 metaheuristics. To evaluate the recent competitive variants and the newly proposed metaheuristics comparatively, 11 newly proposed metaheuristics and 4 variants of established metaheuristics are comprehensively compared on the CEC2017 benchmark suite. We also investigate whether these algorithms have a search bias toward the center of the search space. The results show that the newly proposed EBCM (effective butterfly optimizer with covariance matrix adaptation) algorithm performs comparably to the 4 well-performing variants of the established metaheuristics and shares similar properties and behaviors in many respects, such as convergence, diversity, and the exploration-exploitation trade-off. The performance of all 15 algorithms is likely to deteriorate under certain transformations, while the 4 state-of-the-art metaheuristics are less affected by transformations such as shifting the global optimum away from the center of the search space. It should be noted that, except for EBCM, the other 10 new algorithms, mostly proposed during 2019-2020, are inferior to the well-performing 2017 variants of differential evolution and evolution strategy in terms of convergence speed and global search ability on the CEC2017 functions.
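Why an unshifted benchmark can flatter a center-biased search is easy to see in a toy setting. The sketch below is an illustration under assumptions, not the study's protocol: the sphere function stands in for the CEC2017 functions, and `shifted` and `center_biased_guess` are hypothetical helpers. A degenerate "optimizer" that only ever proposes the center of the box looks perfect on the unshifted problem and poor once the optimum is moved.

```python
def sphere(x):
    """Classical sphere function; its global optimum is at the origin."""
    return sum(v * v for v in x)

def shifted(f, shift):
    """Return a version of f whose optimum is moved from the origin to `shift`."""
    def g(x):
        return f([v - s for v, s in zip(x, shift)])
    return g

def center_biased_guess(lower, upper, dim):
    """A degenerate 'optimizer' that always proposes the center of the search box."""
    return [(lower + upper) / 2.0] * dim

dim, lower, upper = 5, -100.0, 100.0
guess = center_biased_guess(lower, upper, dim)

# Unshifted: the optimum coincides with the center of the box.
plain_error = sphere(guess)
# Shifted: the optimum sits at (30, ..., 30), away from the center.
shifted_error = shifted(sphere, [30.0] * dim)(guess)

print(plain_error, shifted_error)
```

Comparing the two errors for a real algorithm, rather than this fixed guess, gives a rough indication of how much of its apparent performance depends on the optimum sitting at the center.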
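Counterfactuals, an emerging type of model explanation, have recently attracted considerable attention from both industry and academia. Unlike conventional feature-based explanations (e.g., attributions), counterfactuals are a series of hypothetical samples that flip the model decision with minimal perturbation to the query. Given valid counterfactuals, humans can reason under "what-if" circumstances and thus better understand the model's decision boundaries. However, releasing counterfactuals can be harmful, since it may unintentionally leak sensitive information to adversaries, raising the risks to both model security and data privacy. To bridge this gap, we propose a novel framework for generating differentially private counterfactuals (DPC) without touching the deployed model or the explanation set, in which noise is injected for protection while the explanatory role of the counterfactuals is preserved. In particular, we train an autoencoder with the functional mechanism to construct noisy class prototypes, and then derive the DPC from the latent prototypes, relying on the post-processing immunity of differential privacy. Further evaluations demonstrate the effectiveness of the proposed framework, showing that DPC can successfully mitigate the risks of both extraction and inference attacks.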
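Benefiting from the digitization of healthcare data and the growth of computing power, machine learning methods are increasingly used in the healthcare domain. Fairness problems have been identified in machine learning for healthcare, resulting in unfair allocation of limited healthcare resources or excessive health risks for certain groups. Addressing these fairness problems has therefore attracted increasing attention from the healthcare community. However, the intersection of machine learning for healthcare and fairness in machine learning remains understudied. In this review, we build the bridge by exposing fairness problems, summarizing possible biases, sorting out mitigation methods, and pointing out challenges along with future opportunities.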
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
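The idea of adding triggers to the raw data before distillation can be pictured with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the helper names `stamp_trigger` and `poison`, the bottom-right patch location, the patch size and value, the poisoned fraction, and the relabeling to a target class. Images are plain nested lists so the example stays self-contained.

```python
import copy

def stamp_trigger(image, patch_value=1.0, patch_size=2):
    """Return a copy of `image` with a square patch written into the bottom-right corner."""
    stamped = copy.deepcopy(image)
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = patch_value
    return stamped

def poison(dataset, target_label, fraction=0.5):
    """Stamp the trigger on the first `fraction` of samples and relabel them (assumed scheme)."""
    n_poison = int(len(dataset) * fraction)
    poisoned = []
    for i, (image, label) in enumerate(dataset):
        if i < n_poison:
            poisoned.append((stamp_trigger(image), target_label))
        else:
            poisoned.append((image, label))
    return poisoned

# Four blank 4x4 "images" with arbitrary labels, used only as toy data.
clean = [([[0.0] * 4 for _ in range(4)], label) for label in (3, 5, 7, 9)]

# In this sketch, the poisoned set is what would be handed to the distillation procedure.
raw_for_distillation = poison(clean, target_label=0)
```

An iteratively updated trigger, as described for DOORPING, would replace the fixed patch with one that is re-optimized during distillation; that loop is not sketched here.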
Automatic music generation with artificial intelligence typically requires a large amount of data, which is hard to obtain for many less common genres and musical instruments. To tackle this issue, we present ongoing work and preliminary findings on the possibility for deep models to transfer knowledge from language to music, by fine-tuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances. We show that by doing so, one of the largest state-of-the-art models (GPT3) is capable of generating reasonable drum grooves, while models that are not pre-trained (Transformer) show no such ability beyond naive repetition. Evaluating generated music is a challenging task, and evaluating drum grooves is even more so, since there is little precedent in the literature. Hence, we propose a tailored structural evaluation method and analyze drum grooves produced by GPT3 against those played by human professionals, exposing the strengths and weaknesses of generation by language-to-music transfer. Our findings suggest that language-to-music transfer learning with large language models is viable and promising.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. It is therefore difficult to deploy them on edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, where a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. We then propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
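The teacher-student mimicry that KD relies on is commonly expressed as a divergence between temperature-softened output distributions. The sketch below shows that generic soft-label loss for a single node's class logits; it is not RELIANT and includes no fairness term. The function names, the temperature of 2.0, and the toy logits are assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl

# Toy class logits for one node; the values are arbitrary.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

A student that reproduces the teacher's soft predictions drives this loss toward zero, which is also how any bias encoded in those predictions can carry over to the student.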